Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vdev_file: implement TRIM-like support for FreeBSD 14 #16496

Closed
wants to merge 2 commits into from

Conversation

robn
Copy link
Member

@robn robn commented Sep 1, 2024

Motivation and Context

Just reading code and noticed FreeBSD support for TRIM to file-backed vdevs was commented out. Looked further and found FreeBSD 14 implemented support for it, so for fun I wired it up.

Description

First, rename vfs_file_fallocate to vfs_file_deallocate, and push the Linux-specific hole-punching flags down to the Linux implementation, and make a FreeBSD no-op implementation.

Then, fill out the FreeBSD implementation with calls to fspacectl() (userspace) or fo_fspacectl() (kernel).

How Has This Been Tested?

Compile checked on Linux 6.1, FreeBSD 13 & 14.

On Linux, zpool_trim and trim test tags continue to pass.

On FreeBSD its a little more complicated. Currently UFS doesn't implement a specific VOP_DEALLOCATE, so the generic fallback is used, which simply writes zeroes. As a result, enabling the test tags just gives a bunch of failed tests because they all look for the effective size on disk, that is, they assume the deallocated regions will become sparse. I don't really want to attempt to change the test suite to use a different filesystem (my FreeBSD test runner is already in tenuous shape as it is).

But, tmpfs(5) does support VOP_DEALLOCATE, so we can faff around a bit to try and prove it that way.

Lets make a tmpfs, and then make a file full of data:

$ mkdir /tmpfs
$ mount -t tmpfs tmpfs /tmpfs
$ dd if=/dev/random of=/tmpfs/file bs=100M count=1

Because its full of data, its effective size is 100K 1K-blocks:

$ du -s /tmpfs/file
102400	/tmpfs/file

I wrote a daft little program, dumpholes, to show the data/hole structure of a file. That also shows that it's all-data:

$ dumpholes /tmpfs/file
DATA 0
END 104857600

So we make a pool over the top, then issue the trim:

$ sudo zpool create tank /tmpfs/file
$ sudo zpool trim -w tank
$ zpool status -t
  pool: tank
 state: ONLINE
config:

	NAME           STATE     READ WRITE CKSUM
	tank           ONLINE       0     0     0
	 /tmpfs/file  ONLINE       0     0     0  (100% trimmed, completed at Sun Sep  1 13:29:55 2024)

errors: No known data errors

For extra fun, we can use dtrace to see that yes, its really doing things:

$ dtrace -n 'vfs::vop_deallocate:entry { ap = (struct vop_deallocate_args *)arg1; printf("off=%x len=%x", *ap->a_offset, *ap->a_len); }'
dtrace: description 'vfs::vop_deallocate:entry ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  0  79177             vop_deallocate:entry off=410000 len=ff0000
  1  79177             vop_deallocate:entry off=1800000 len=c00000
  1  79177             vop_deallocate:entry off=2800000 len=c00000
  1  79177             vop_deallocate:entry off=3800000 len=c00000
  1  79177             vop_deallocate:entry off=4800000 len=c00000

After, we can see the effective size of the file has dropped, and its now full of holes:

$ du -s /tmpfs/file
36980	/tmpfs/file
$ dumpholes /tmpfs/file
DATA 0
HOLE 4313088
DATA 20971520
HOLE 25165824
DATA 37748736
HOLE 41943040
DATA 54525952
HOLE 58720256
DATA 71303168
HOLE 75497472
DATA 88080384
END 104857600

And if we read back part of the zeroed region, it is, indeed, zeroes:

$ xxd -a -s 0x4800000 -l 0xc00000 /tmp/file
04800000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
*
053ffff0: 0000 0000 0000 0000 0000 0000 0000 0000  ................

So I suppose that works?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making it feature-complete!

module/os/freebsd/zfs/zfs_file_os.c Outdated Show resolved Hide resolved
module/os/linux/zfs/zfs_file_os.c Show resolved Hide resolved
module/os/freebsd/zfs/vdev_file.c Show resolved Hide resolved
@robn robn force-pushed the freebsd-file-trim branch from 7f0c1a3 to 3c0c0ef Compare September 4, 2024 12:37
@robn
Copy link
Member Author

robn commented Sep 4, 2024

Last push makes SET_ERROR always get called when any of the methods fail.

module/os/freebsd/zfs/vdev_file.c Show resolved Hide resolved
module/os/freebsd/zfs/zfs_file_os.c Outdated Show resolved Hide resolved
@robn robn force-pushed the freebsd-file-trim branch from 3c0c0ef to 19da2ac Compare September 4, 2024 22:16
Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple more comments, but otherwise looks good to me.

lib/libzpool/kernel.c Outdated Show resolved Hide resolved
module/os/freebsd/zfs/zfs_file_os.c Outdated Show resolved Hide resolved
@robn robn force-pushed the freebsd-file-trim branch 2 times, most recently from 407a467 to becc671 Compare September 6, 2024 09:09
@robn robn force-pushed the freebsd-file-trim branch from becc671 to bceb3c4 Compare September 16, 2024 09:56
@mcmilk
Copy link
Contributor

mcmilk commented Sep 16, 2024

In a next step we can re-enable TRIM tests on FreeBSD via ZTS.
Maybe PR #16537 is ready now.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With #16537 merged it'd be good to see this rebased on master and the TRIM tests enabled on FreeBSD as @mcmilk suggested.

@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Sep 18, 2024
We only use it on a specific way: to punch a hole in (make sparse) a
region of a file, in order to implement TRIM-like behaviour.

So, call the op "deallocate", and move the Linux-style mode flags down
into the Linux implementation, since they're an implementation detail.

FreeBSD gets a no-op stub (for the moment).

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
FreeBSD 14 gained a `VOP_DEALLOCATE` VFS operation and a `fspacectl`
syscall to use it. At minimum, these zero the given region, and if the
underlying filesystem supports it, can make the region sparse. We can
use this to get TRIM-like behaviour for file vdevs.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
@robn robn force-pushed the freebsd-file-trim branch from bceb3c4 to 6e05c1b Compare September 18, 2024 12:45
@robn
Copy link
Member Author

robn commented Sep 18, 2024

Rebased. I haven't enabled the tests, as I couldn't tell if the new runners are using something other than UFS on FreeBSD. See OP; UFS doesn't support hole punching yet.

behlendorf pushed a commit that referenced this pull request Sep 18, 2024
FreeBSD 14 gained a `VOP_DEALLOCATE` VFS operation and a `fspacectl`
syscall to use it. At minimum, these zero the given region, and if the
underlying filesystem supports it, can make the region sparse. We can
use this to get TRIM-like behaviour for file vdevs.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #16496
ptr1337 pushed a commit to CachyOS/zfs that referenced this pull request Nov 14, 2024
We only use it on a specific way: to punch a hole in (make sparse) a
region of a file, in order to implement TRIM-like behaviour.

So, call the op "deallocate", and move the Linux-style mode flags down
into the Linux implementation, since they're an implementation detail.

FreeBSD gets a no-op stub (for the moment).

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes openzfs#16496
ptr1337 pushed a commit to CachyOS/zfs that referenced this pull request Nov 21, 2024
We only use it on a specific way: to punch a hole in (make sparse) a
region of a file, in order to implement TRIM-like behaviour.

So, call the op "deallocate", and move the Linux-style mode flags down
into the Linux implementation, since they're an implementation detail.

FreeBSD gets a no-op stub (for the moment).

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes openzfs#16496
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants